Europäisches Patentamt
European Patent Office
Office européen des brevets



Publication number: 0 372 734 A1



Application number: 89311830.7
Date of filing: 15.11.89



EUROPEAN PATENT APPLICATION 

Int. Cl.: G10L 5/04



Priority: 23.11.88 US 275581

Date of publication of application:
13.06.90 Bulletin 90/24

Designated Contracting States:
AT BE CH DE ES FR GB GR IT LI LU NL SE



Applicant: DIGITAL EQUIPMENT
CORPORATION
146 Main Street
Maynard, MA 01754(US)

Inventor: Vitale, Anthony John
22 Saint James Drive
Northborough Massachusetts 01532(US)
Inventor: Levergood, Thomas Mark
75 Blackstone Street
Bellingham Massachusetts 02019(US)
Inventor: Conroy, David Gerard
78 Concord Street
Maynard Massachusetts 01754(US)

Representative: Hale, Peter et al
Kilburn & Strode 30 John Street
London WC1N 2DD(GB)



Name pronunciation by synthesizer.

An apparatus and method for correctly pronouncing proper names from text using a computer provides a
dictionary which performs an initial search for the name. If the name is not in the dictionary, it is sent to a filter
which either positively identifies a single language group or eliminates one or more language groups as the
language group of origin for that word. When the filter cannot positively identify the language group of origin for
the name, a list of possible language groups is sent to a grapheme analyzer. Using grapheme analysis, the most
probable language group of origin for the name is determined and sent to a language-sensitive letter-to-sound
section. In this section, the name is compared with language-sensitive rules to provide accurate phonemics and
stress information for the name. The phonemics (including stress information) are sent to a voice realization unit
for audio output of the name.
NAME PRONUNCIATION BY SYNTHESIZER



The present invention relates to text-to-speech conversion by a computer, and specifically to correctly
pronouncing proper names from text.

Name pronunciation may be used in the area of field service within the telephone and computer
industries. It is also found within larger corporations having reverse directory assistance (number to name)
as well as in text-messaging systems where the last name field is a common entity.

There are many devices commercially available which synthesize American English speech by computer.
One of the functions sought for speech synthesis which presents special problems is the pronunciation
of an unlimited number of ethnically diverse surnames. Due to the extremely large number of different
surnames in an ethnically diverse country such as the United States, the pronouncing of a surname cannot
be practically implemented at present by use of other voice output technologies such as audiotape or
digitized stored voice.

There is typically an inverse relation between the pronunciation accuracy of a speech synthesizer in its
source language and the pronunciation accuracy of the same synthesizer in a second language. The United
States is an ethnically heterogeneous and diverse country with names deriving from languages which range
from the common Indo-European ones such as French, Italian, Polish, Spanish, German, Irish, etc. to more
exotic ones such as Japanese, Armenian, Chinese, Arabic, and Vietnamese. The pronunciation of surnames
from the various ethnic groups does not conform to the rules of standard American English. For example,
most Germanic names are stressed on the first syllable, whereas Japanese and Spanish names tend to
have penultimate stress, and French names, final stress. Similarly, the orthographic sequence CH is
pronounced [č] in English names (e.g. CHILDERS), [š] in French names such as CHARPENTIER, and [k] in
Italian names such as BRONCHETTI. Human speakers often provide correct pronunciation by "knowing"
the language of origin of the name. The problem faced by a voice synthesizer is speaking these names
using the correct pronunciation, but since computers do not "know" the ethnic origin of the name, that
pronunciation is often incorrect.

A system has been proposed in the prior art in which a name is first matched against a number of
entries in a dictionary which contains the most common names from a number of different language groups.
Each dictionary entry contains an orthographic form and a phonetic equivalent. If a match occurs, the
phonetic equivalent is sent to a synthesizer which turns it into an audible pronunciation for that name.

When the name is not found in the dictionary, the proposed system used a statistical trigram model.
This trigram analysis involved estimating a probability that each three-letter sequence (or trigram) in a name
is associated with an etymology. When the program saw a new word, a statistical formula was applied in
order to estimate for each etymology a probability based on each of the three-letter sequences (trigrams) in
the word.

The problem with this approach is the accuracy of the trigram analysis. This is because the trigram
analysis computes only a probability, and with all language groups being considered as a possible
candidate for the language group of origin of a word, the accuracy of the selection of the language group of
origin of the word is not as high as when there are fewer possible candidates.

According to one aspect of the present invention there is provided a method for positively identifying or
eliminating a language group as a language group of origin for a given word, comprising:
comparing substrings of graphemes of an input word to a stored set of filter rules until either a match of one
of the substrings to one of the filter rules positively identifies a language group, or any language group is
eliminated when a match of one of the substrings to one of the filter rules indicates a language group is
eliminated from consideration as a language group of origin for the input word; and
producing a list of possible non-eliminated language groups of origin when no language group is positively
identified as the language group of origin or indicating the language group of origin when the language
group of origin is positively identified.

According to another aspect of the present invention there is provided a method for generating correct
phonemics for a given input word according to a language group of origin of the input word, the method
comprising:
filtering the input word in a filter to identify a language group of origin for the input word or to eliminate at
least one language group of origin for the input word;
sending the input word and a language tag indicating a language group of origin for the input word from the
filter to a letter-to-sound module containing letter-to-sound rules when the filter positively identifies a
language group of origin for the input word;
sending from the filter the input word and any non-eliminated language groups to a grapheme analyser
when a language group of origin for the input word is not positively identified by the filter;
producing a most probable language group of origin for the input word by analysing graphemes in the input
word;
sending the input word and the most probable language group of origin to a subset of the letter-to-sound
module corresponding to the most probable language group;
producing in the subset of letter-to-sound module segmental phonemics for the input word;
sending the segmental phonemics and the language tag from the letter-to-sound module to a stress
assignment section;
producing stress assignment information for the input word in the stress assignment section; and
sending the segmental phonemics and the stress assignment information to a voice realisation unit.

According to this aspect there is also provided apparatus for positively identifying or eliminating a
language group as a language group of origin for a given word, comprising:
a filter rule store which stores a set of filter rules, a first subset of the filter rules positively identifying a
language group, and a second subset of the filter rules eliminating a language group;
a comparator which compares substrings of graphemes of an input word to the first and second subsets of
filter rules until a match of one of the substrings to one of the first subset of filter rules positively identifies a
language group or eliminates any language group when a match of one of the substrings to one of the
second subset of filter rules indicates a language group is eliminated from consideration as a language
group of origin for the input word; and
an output which produces a list of possible language groups of origin when no language group is positively
identified as the language group of origin, and which produces an indication of the language group of origin
when the language group of origin is positively identified.

The present invention solves the above problem by improving the accuracy of the trigram analysis. This
is done by providing a filter which either positively identifies a language group as a language group of
origin, or eliminates a language group as a language group of origin for a given input word. The filtering
method according to the present invention comprises identifying or eliminating a language group as a
language group of origin for an input word according to a stored set of filter rules. The step of identifying or
eliminating a language group includes performing an exhaustive search of the rule set using a right-to-left
scan. Language groups are eliminated when a match of one of these substrings to one of the filter rules
indicates that a language group should be eliminated from consideration as the language group of origin for
the input word. This is done until a match of one of the substrings to one of the rules positively identifies a
language group. When no language group is positively identified as a language group of origin after all of
the substrings for a given input word are compared, a list of possible language groups of origin is produced.
This filter method also produces a positively identified language group of origin when there is a positive
identification.

The advantages of using a filter before the trigram analysis include avoiding unnecessary trigram
analysis when filter rules can positively identify a language group as a language group of origin. When no
language group can be positively identified, the filtering method also reduces the chances of an incorrect
guess being made in the trigram analysis by reducing the number of possible language groups in
consideration as the language group of origin. Through the elimination of some language groups, the
identification of a language group of origin is more accurate, as discussed above.

The invention also includes a method for generating correct phonemics for a given input word
according to the language group of origin of the input word. This method comprises searching a dictionary
for an entry corresponding to an input word, each entry containing a word and phonemics for that word.
This entry is then sent to a voice realization unit for pronunciation when the dictionary search reveals an
entry corresponding to the input word. The input word is sent to a filter when the input word does not have
a corresponding entry in the dictionary.

The next step in the method involves filtering to identify a language group of origin for the input word or
to eliminate at least one language group of origin for the input word. When the filter positively identifies a
language group of origin for the input word, the input word and a language tag indicating a language group
of origin for the input word are sent from the filter to a letter-to-sound module. When a language group of
origin is not positively identified by the filter, the input word and any language groups not eliminated are
sent from the filter to a trigram analyzer.

A most probable language group of origin for the input word is produced by analyzing trigrams
occurring in the input word. This most probable language group of origin produced by the trigram analysis
is sent along with the input word to a subset of letter-to-sound rules that correspond to the most probable
language group. Phonemics are generated for the input word according to the corresponding subset of
letter-to-sound rules.

The invention in all respects also extends to a method and apparatus for speech synthesis incorporating
the above features. The speech synthesis may include voice realization arranged to pronounce the word
according to the determined language.

The present invention can be put into practice in various ways, one of which will now be described by
way of example with reference to the accompanying drawings in which:

FIGURE 1 illustrates a logic block diagram of language identification and phonemics realization
modules; and

FIGURE 2 shows a logic block diagram of a name analysis system containing the language group
identification and phonemic realization module of Figure 1, constructed in accordance with the present
invention.

Figure 1 is a diagram illustrating the various logic blocks of the present invention. The physical
embodiment of the system can be realized by a commercially available processor logically arranged as
shown.

A name to be pronounced is accepted as an input. A search is made through entries in a dictionary
10 for this input name. Each dictionary entry has a name and phonemics for that name. A semantic tag
identifies the word as being a name.

A search for an input name that corresponds to an entry in the dictionary 10 results in a hit. The
dictionary 10 will then immediately send the entry (name and phonemics) to a voice realisation unit 50,
which pronounces the name according to the phonemics contained in the entry. The pronunciation process
for that input word would then be complete.

A dictionary miss occurs when there is no entry corresponding to the input name in the dictionary 10. In
order to provide the correct pronunciation, the system attempts to identify the language group of origin of
the input name. This is done by sending to a filter 12 the input name which missed in the dictionary 10.
The input name is analyzed by the filter 12 in order to either positively identify a language group or
eliminate certain language groups from further consideration.
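
The hit-or-miss behaviour of the dictionary 10 described above can be illustrated with a minimal sketch. The function name `pronounce`, the `fallback` callable standing in for the filter 12 and later stages, and the placeholder phonemic string are all assumptions made for illustration; none of them appears in the specification.

```python
from typing import Callable, Dict


def pronounce(name: str,
              dictionary: Dict[str, str],
              fallback: Callable[[str], str]) -> str:
    """Return phonemics for `name`, trying the dictionary before anything else."""
    key = name.upper()
    entry = dictionary.get(key)
    if entry is not None:
        # Dictionary hit: the stored phonemics go straight to voice realization.
        return entry
    # Dictionary miss: hand the name to the filter and the later stages.
    return fallback(key)


# Hypothetical dictionary; the phonemic string is a placeholder, not real output.
sample_dictionary = {"SMITH": "s m ih th"}
```

A caller would pass, as `fallback`, whatever routine implements the filter, trigram analysis and letter-to-sound stages.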

The filter 12 operates to filter out language groups for input names based on a predetermined set of
rules. These rules are provided to the filter 12 by a rule store described later.

Each input name is considered to be composed of a string of graphemes. Some strings within an input
name will uniquely identify (or eliminate) a language group for that name. For example, according to one
rule the string BAUM positively identifies the input name as German (e.g. TANNENBAUM). According to
another rule the string MOTO at the end of a name positively identifies the language group as Japanese
(e.g. KAWAMOTO). When there is such a positive identification, the input name and the identified language
group (L TAG) are sent directly to a letter-to-sound section 20 that provides the proper phonemics to the
voice realization unit 50.

The filter 12 otherwise attempts to eliminate as many language groups as possible from further
consideration when positive identification is not possible. This increases the probability accuracy of the
remaining analysis of the input name. For example, a filter rule provides that if the string -B is at the end of
a name, language groups such as Japanese, Slavic, French, Spanish and Irish can be eliminated from
further consideration. By this elimination, the following analysis to determine the language group of origin
for an input name not positively identified is simplified and improved.
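
A simplified sketch of such a filter is given below, using only the three example rules mentioned above. The rule tables, the set of candidate language groups and the function names are illustrative assumptions; treating BAUM as matching anywhere in the name is also an assumption, and the right-to-left scan order of the described embodiment is not modelled.

```python
from typing import Optional, Set, Tuple

# Hypothetical set of candidate language groups.
LANGUAGES = {"English", "German", "Japanese", "Slavic",
             "French", "Spanish", "Irish", "Italian"}

# (pattern, must_be_at_end, language group positively identified)
IDENTIFY_RULES = [("BAUM", False, "German"), ("MOTO", True, "Japanese")]
# (pattern, must_be_at_end, language groups eliminated)
ELIMINATE_RULES = [("B", True, {"Japanese", "Slavic", "French", "Spanish", "Irish"})]


def _matches(name: str, pattern: str, at_end: bool) -> bool:
    return name.endswith(pattern) if at_end else pattern in name


def filter_name(name: str) -> Tuple[Optional[str], Set[str]]:
    """Return (language tag or None, set of language groups still possible)."""
    name = name.upper()
    # Positive identification is checked first, so it dominates elimination,
    # and the first matching rule in the ordered list applies.
    for pattern, at_end, language in IDENTIFY_RULES:
        if _matches(name, pattern, at_end):
            return language, {language}
    remaining = set(LANGUAGES)
    for pattern, at_end, eliminated in ELIMINATE_RULES:
        if _matches(name, pattern, at_end):
            remaining -= eliminated
    return None, remaining
```

When the first element of the result is `None`, the second element is the list of non-eliminated language groups that would be passed on for further analysis.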

Assuming that no language group can be positively identified as the language group of origin by the
filter 12, further analysis is needed. This is performed by a trigram analyzer 14 which receives the input
name and the list of any language groups not eliminated by the filter 12. The trigram analyzer 14 parses the
string of graphemes (the input name) into trigrams, which are grapheme strings that are three graphemes
long. For example, the grapheme string #SMITH# is parsed into the following five trigrams: #SM, SMI, MIT,
ITH, TH#. For trigram analysis, the hash sign (word-boundary) is considered a grapheme. Therefore, the
number of trigrams is always the same as the number of graphemes in the name.
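
The parsing step can be sketched in a few lines. The function name `trigrams` is an illustrative choice, and each letter is assumed to be one grapheme.

```python
from typing import List


def trigrams(name: str) -> List[str]:
    """Split a name into overlapping three-grapheme strings, with '#' as word boundary."""
    padded = "#" + name.upper() + "#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


print(trigrams("Smith"))  # ['#SM', 'SMI', 'MIT', 'ITH', 'TH#']
```

Because one boundary mark is added at each end, a name of k letters always yields k trigrams.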

The probability for each of the trigrams being from a particular language group is input to the trigram
analyzer 14. This probability, computed from an analysis of a name data base, is received as an input from
a frequency table of trigrams for each language group that was not eliminated by the filter 12. The same
thing is also done for each of the other trigrams of the grapheme string.

The following (partial) matrix shows sample probabilities for the surname VITALE:

In the array above, L is a language group and n is the number of language groups not eliminated by the
filter 12. The trigram #VI has a probability of .0679 of being from language group Li, .4659 of being from the
language group Lj and .2093 of being from language group Ln. Lj is averaged as the highest probability and
thus the language group is identified.

The probability of each of the trigrams of the grapheme string (input name) is similarly input to the
trigram analyzer 14. The probability of each trigram in an input name is averaged for each language group.
This represents the probability of the input name originating from a particular language group. The
probability that the grapheme string #VITALE# belongs to a particular language group is produced as a
vector of probabilities from the total probability line. From this vector of probabilities, other items such as
standard deviation and thresholding can also be calculated. This ensures that a single trigram cannot overly
contribute to or distort the total probability.
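
The averaging step might be sketched as follows. The function name, the nested-dictionary layout of the probability table and the treatment of an unseen trigram as probability 0.0 are assumptions made for this illustration; only the three #VI values are taken from the text.

```python
from typing import Dict, Iterable, List


def average_probabilities(tris: List[str],
                          table: Dict[str, Dict[str, float]],
                          languages: Iterable[str]) -> Dict[str, float]:
    """Average, per language group, the probabilities of all trigrams in a name."""
    scores = {}
    for lang in languages:
        # A trigram missing from the table contributes 0.0 (an assumption).
        total = sum(table.get(t, {}).get(lang, 0.0) for t in tris)
        scores[lang] = total / len(tris) if tris else 0.0
    return scores


# Only the #VI row is taken from the description; a real table has one row per trigram.
table = {"#VI": {"Li": 0.0679, "Lj": 0.4659, "Ln": 0.2093}}
scores = average_probabilities(["#VI"], table, ["Li", "Lj", "Ln"])
best = max(scores, key=scores.get)
```

The resulting dictionary plays the role of the vector of probabilities, from which the most probable language group and any thresholding quantities can be derived.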

Although the illustrated embodiment analyzes trigrams, the analyzer 14 can be configured to analyze 
different length grapheme strings, such as two-grapheme or four-grapheme strings. 

In the example above, the trigram analyzer 14 shows that language group Lj is the most probable
language group of origin for the given input name, since it has the highest probability. It is this most
probable language group that becomes the L TAG for the input name. The L TAG and the input name are
then sent to the letter-to-sound section 20 to produce the phonemics for the input.

The filter rules are constructed in such a way that ambiguity of identification is not possible. That is, a
language may not be both eliminated and positively identified since a dominance relationship applies such
that a positive identification is dominant over an elimination rule in the unlikely event of a conflict.

Similarly, an input name may not be positively identified with more than one language group because the
filter rules constitute an ordered set such that the first positive identification applies.

The system may default to a certain language group if one of two thresholding criteria is met: (a)
absolute thresholding occurs when the highest probability determined by the trigram analyzer 14 is below a
predetermined threshold Ti. This would mean that the trigram analyzer 14 could not determine from among
the language groups a single language group with a reasonable degree of confidence; (b) relative
thresholding occurs when the difference in probabilities between the language group identified as having
the highest probability and the language group identified as having the second highest probability falls
below a threshold Tj as determined by the trigram analyzer 14.
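
The two thresholding criteria can be expressed as a short decision function. This is a sketch only: the function name, the parameter names `t_abs` and `t_rel` (standing for the thresholds Ti and Tj), and the numeric values in the usage note are hypothetical.

```python
from typing import Dict


def choose_language(scores: Dict[str, float],
                    default: str,
                    t_abs: float,
                    t_rel: float) -> str:
    """Pick the most probable language group, or the default if confidence is low."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return default
    best_lang, best_p = ranked[0]
    # (a) Absolute thresholding: the best probability is itself too low.
    if best_p < t_abs:
        return default
    # (b) Relative thresholding: the best is not clearly ahead of the runner-up.
    if len(ranked) > 1 and best_p - ranked[1][1] < t_rel:
        return default
    return best_lang
```

For instance, with `t_abs=0.3` and `t_rel=0.1`, scores of 0.6 and 0.2 select the first language group, whereas scores of 0.45 and 0.42 fall back to the default.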

The default to a specified language group is a settable parameter. In an English-speaking environment,
for example, a default to an English pronunciation is generally the safest course since a human, given a low
confidence level, would most likely resort to a generic English pronunciation of the input name. The value of
the default as a settable parameter is that the default would be changed in certain situations, for example,
where the telephone exchange indicates that a telephone number is located in a relatively homogeneous
ethnic neighborhood.

As mentioned earlier, the name and language tag (LTAG) sent by either the filter 12 or the trigram
analyzer 14 is received by the letter-to-sound rule section 20. The letter-to-sound rule section 20 is broken
up conceptually into separate blocks for each language group. In other words, language group (Li) will have
its own set of letter-to-sound rules, as does language group (Lj), and so on through language group (Ln).

Assuming that the input name has been identified sufficiently so as not to generate a default
pronunciation, the input name is sent to the appropriate language group letter-to-sound block 22i-n according
to the language tag associated with the input name.

In the letter-to-sound rule section 20, the rules for the individual language group blocks 22 are subsets
of a larger and more complex set of letter-to-sound rules for other language groups including English. A
letter-to-sound block 22i for a specific language group Li that has been identified as the language group of
origin will attempt to match the largest grapheme sequence to a rule. This is different from the filter 12
which searches top to bottom, and in this embodiment right to left, for the string of graphemes in an input
name that fits a filter rule. The letter-to-sound block 22i for a specific language scans the grapheme string
from left to right or right to left, the illustrated embodiment using a right to left scan.

An example of the letter-to-sound rules for a specific block Li can be seen for a name such as
MANKIEWICZ. This input name would be identified as originating from the Slavic language group, having
the highest probability, and would therefore be sent to the Slavic letter-to-sound rules block 22i. In that
block 22i, the grapheme string -WICZ has a pronunciation rule to provide the correct segmental phonemics
of the string. However, the grapheme string -KIEWICZ also has a rule in the Slavic rule set. Since this is a
longer grapheme string, this rule would apply first. The segmental phonemics for any remaining graphemes
which do not correspond to a language-specific pronunciation rule will then be determined from the general
pronunciation block. In this example, the segmental phonemics for the graphemes M, A, and N would be
determined (separately) according to the general pronunciation rules. The letter-to-sound block 22i sends
the concatenated phonemics of both the language-sensitive grapheme strings and the non-language-sensitive
grapheme strings together to the voice realization unit 50 for pronunciation.
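
The longest-match behaviour in the MANKIEWICZ example can be sketched as a right-to-left scan that always prefers the longest language-specific string ending at the current position. The function name, the rule dictionaries and the bracketed placeholder outputs are illustrative assumptions; real rules would yield phonemic symbols.

```python
from typing import Dict, List


def letter_to_sound(name: str,
                    language_rules: Dict[str, str],
                    general_rules: Dict[str, str]) -> List[str]:
    """Scan right to left, applying the longest matching language-specific rule."""
    name = name.upper()
    pieces = []
    end = len(name)
    while end > 0:
        # The smallest start index gives the longest substring ending at `end`.
        for start in range(end):
            chunk = name[start:end]
            if chunk in language_rules:
                pieces.append(language_rules[chunk])
                end = start
                break
        else:
            # No language-specific rule ends here: use the general rule for one letter.
            letter = name[end - 1]
            pieces.append(general_rules.get(letter, letter.lower()))
            end -= 1
    pieces.reverse()  # pieces were collected from the end of the name
    return pieces


# Placeholder outputs stand in for the real segmental phonemics.
slavic_rules = {"WICZ": "<WICZ>", "KIEWICZ": "<KIEWICZ>"}
general_rules = {"M": "m", "A": "a", "N": "n"}
print(letter_to_sound("Mankiewicz", slavic_rules, general_rules))
# ['m', 'a', 'n', '<KIEWICZ>']
```

The -KIEWICZ rule wins over -WICZ because the scan tries the longer candidate first, and the remaining M, A and N fall through to the general rules one at a time.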

The filter 12 does not contain all of the larger strings which are language specific that are in the
letter-to-sound rules 20. The larger strings are not all needed since, for example, the string -WICZ would positively
identify an input name as Slavic in origin. There is then no need for the string -KIEWICZ filter rule, since
-WICZ is a subset of -KIEWICZ and thus would identify the input name.

The letter-to-sound module outputs the phonemics for names mainly in the form of segmental
phonemic information. The output of the letter-to-sound rule blocks 22i-n serves as the input to stress
sections 24i-n. These stress sections 24i-n take the LTAG along with the phonemics produced by individual
letter-to-sound rule blocks 22i-n and output a complete phonemic string containing both segmental
phonemes (from letter-to-sound rule blocks 22i-n) and the correct stress pattern for that language. For
example, if the language identified for the name VITALE was Italian, and letter-to-sound rule block 22i
provided the phoneme string [vitali], then the stress section 24i would place stress on the penultimate
syllable so that the final phonemic string would be [vitáli].

It should be noted that the actual rules used in the filter 12, in the letter-to-sound section 20, and the
stress sections 24i-n are rules which are either known or easily acquired by one skilled in the art of
linguistics.

The system described above can be viewed as a front end processor for a voice realization unit 50. The
voice realization unit 50 can be a commercially available unit for producing human speech from graphemic
or phonemic input. The synthesizer can be phoneme-based or based on some other unit of sound, for
example diphone or demi-syllable. The synthesizer can also synthesize a language other than English.

Figure 2 shows a language group identification and phonetic realization block 60 as part of a system.
The language group identification and phonetic realization block 60 is made up of the functional blocks
shown in Figure 1. As shown, the input to the language identification and phonetic realization block 60 is the
name, the filter rules and the trigram probabilities. The output is the name, the language tag and
phonemics, which are sent to the voice realization unit 50. It should be noted that phonemics means, in this
context, any alphabet of sound symbols including diphones and demi-syllables.

The system according to Figure 2 marks grapheme strings as belonging to a particular language group.
The language identifier is used to pre-filter a new data base in order to refine the probability table to a
particular data base. The analysis block 62 receives as inputs the name and language tag and statistics
from the language identification and phonetic realization block 60. The analysis block takes this information
and outputs the name and language tag to a master language file 64 and produces rules to a filter rule store
68. In this way, the data base of the system is expanded as new input names are processed so that future
input names will be more easily processed. The filter rule store 68 provides the filter rules to the filter 12
and the language identification and phonetic realization block 60.

The master file contains all grapheme strings and their language group tag. This block 64 is produced
by the analysis block 62. The trigram probabilities are arranged in a data structure 66 designed for ease of
searching for a given input trigram. For example, the illustrated embodiment uses an N-deep three-dimensional
matrix, where N is the number of language groups.

Trigram probability tables are computed from the master file using the following algorithm: 

    compute total number of occurrences of each trigram for
    all language groups L (1-N):

    for all language groups L
        for all grapheme strings S in L
            for all trigrams T in S
                if (count[T][L] == 0)
                    uniq[L] += 1
                count[T][L] += 1

    for all possible trigrams T in master
        sum = 0
        for all language groups L
            sum += count[T][L] / uniq[L]
        for all language groups L
            if sum > 0, prob[T][L] = count[T][L] / uniq[L] / sum
            else prob[T][L] = 0.0
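
A runnable rendering of the table-building algorithm is sketched below. The representation of the master file as a dictionary mapping each language group to a list of names, and the function names, are assumptions for illustration; the counting and normalisation follow the algorithm above.

```python
from collections import defaultdict
from typing import Dict, List


def trigrams_of(name: str) -> List[str]:
    padded = "#" + name.upper() + "#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def build_probability_table(master: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Compute prob[T][L] from a master file given as {language group: [names]}."""
    count = defaultdict(lambda: defaultdict(int))  # count[T][L]
    uniq = defaultdict(int)                        # distinct trigrams seen per L
    for lang, names in master.items():
        for name in names:
            for t in trigrams_of(name):
                if count[t][lang] == 0:
                    uniq[lang] += 1
                count[t][lang] += 1

    prob = {}
    for t in list(count):
        # count / uniq corresponds to X / Y for each language group.
        ratios = {lang: count[t][lang] / uniq[lang]
                  for lang in master if uniq[lang] > 0}
        total = sum(ratios.values())
        prob[t] = {lang: (ratios.get(lang, 0.0) / total if total > 0 else 0.0)
                   for lang in master}
    return prob
```

For every trigram that occurs in the master file, the probabilities over all language groups sum to 1, which is what the normalisation by `total` ensures.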

The trigram frequency table mentioned earlier can be thought of as a three-dimensional array of
trigrams, language groups and frequencies. Frequencies means the percentage of occurrence of those
trigram sequences for the respective language groups based on a large sample of names. The probability
of a trigram being a member of a particular language group can be derived in a number of ways. In this
embodiment, the probability of a trigram being a member of a particular language group is derived from the
well-known Bayes theorem, according to the formula set forth below:
Bayes' Rule states that the probability that B_j occurs given A, P(B_j|A), is

$$P(B_j \mid A) = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i} P(A \mid B_i)\,P(B_i)}$$

More specific to the problem, the probability of a language group L_i given a trigram, T, is P(L_i|T), where

$$P(L_i \mid T) = \frac{P(T \mid L_i)\,P(L_i)}{\sum_{k=1}^{N} P(T \mid L_k)\,P(L_k)}$$

Analyzing further,

$$P(T \mid L_i) = \frac{X}{Y}$$

where X = number of times the token, T, occurred in the language group, L_i
Y = number of uniquely occurring tokens in the language group, L_i

$$P(L_i) = \frac{1}{N} \quad \text{always}$$

where N = number of language groups (nonoverlapping)

$$P(L_i \mid T) = \frac{P(T \mid L_i)\,\frac{1}{N}}{\sum_{k=1}^{N} P(T \mid L_k)\,\frac{1}{N}} = \frac{P(T \mid L_i)}{\sum_{k=1}^{N} P(T \mid L_k)}$$

The final table then has four dimensions: one for each grapheme of the trigram, and one for the
language group.

The trigram probabilities as computed by the block 66 are sent to the language identification and
phonetic realization block 60, and particularly to the trigram analyzer 14 which produces the vector of
probabilities that the grapheme string belongs to a particular language group.

Using the above-described system, names can be more accurately pronounced. Further developments
such as using the first name in conjunction with the surname in order to pronounce the surname more
accurately are contemplated. This would involve expanding the existing knowledge base and rule sets.

Claims 

1. A method for positively identifying or eliminating a language group (Li...Ln) as a language group of
origin for a given word, comprising:
comparing substrings of graphemes of an input word to a stored set of filter rules until either a match of one
of the substrings to one of the filter rules positively identifies a language group, or any language group is
eliminated when a match of one of the substrings to one of the filter rules indicates a language group is
eliminated from consideration as a language group of origin for the input word; and
producing a list of possible non-eliminated language groups of origin when no language group is positively
identified as the language group of origin or indicating the language group of origin when the language
group of origin is positively identified.

2. A method as claimed in claim 1, wherein said comparing step includes the step of searching the filter
rules from top to bottom and right to left.

3. A method as claimed in claim 1, wherein the comparing step includes the step of searching the filter
rules by language group and by grapheme within each language group.

4. A method for generating correct phonemics for a given input word according to a language group of
origin of the input word, the method comprising:
filtering the input word in a filter (12) to identify a language group of origin for the input word or to eliminate
at least one language group of origin for the input word;
sending the input word and a language tag indicating a language group of origin for the input word from the
filter to a letter-to-sound module (22) containing letter-to-sound rules when the filter positively identifies a
language group of origin for the input word;
sending from the filter the input word and any non-eliminated language groups to a grapheme analyser (14)
when a language group of origin for the input word is not positively identified by the filter;
producing a most probable language group of origin for the input word by analysing graphemes in the input
word;
sending the input word and the most probable language group of origin to a subset of the letter-to-sound
module corresponding to the most probable language group;
producing in the subset of letter-to-sound module segmental phonemics for the input word;
sending the segmental phonemics and the language tag from the letter-to-sound module to a stress
assignment section (24);
producing stress assignment information for the input word in the stress assignment section; and
sending the segmental phonemics and the stress assignment information to a voice realisation unit (50).

5. A method as claimed in claim 4, wherein the graphemes are trigrams.

6. A method as claimed in claim 4 or 5, wherein the step of producing a most probable language group
of origin includes the step of computing probabilities of graphemes for an input word being from a particular
language group using Bayes' Rule.

7. A method as claimed in claim 4, 5 or 6, further comprising the step of defaulting to a general
pronunciation when the step of producing a most probable language group of origin produces a most
probable language group of origin having a probability below a predetermined threshold level.

8. A method as claimed in claim 4, 5, 6 or 7, further comprising the step of defaulting to a general
pronunciation when the step of producing a most probable language group of origin produces a most
probable language group of origin having a probability that is not greater by a predetermined amount than a
probability of a next most probable language group of origin.
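
One way to read claims 5 to 8 together is sketched below. It is an illustration under stated assumptions, not the analyser of the disclosure: it assumes equal prior probabilities for all candidate groups, treats trigrams as independent, pads words with a '#' boundary marker, and uses a small floor probability for unseen trigrams. The table layout trigram_probs[group][trigram], the label "GENERAL", and the threshold values min_prob and min_margin are hypothetical.

```python
import math


def trigrams(word):
    """Split a word into overlapping trigrams; '#' marks word boundaries."""
    padded = f"#{word.lower()}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def posterior(word, trigram_probs, candidates, floor=1e-6):
    """P(group | word) by Bayes' Rule, assuming equal priors per group."""
    log_like = {}
    for group in candidates:
        table = trigram_probs[group]
        # Sum of log P(trigram | group); unseen trigrams get the floor value.
        log_like[group] = sum(math.log(table.get(t, floor))
                              for t in trigrams(word))
    peak = max(log_like.values())
    # Subtract the peak before exponentiating to avoid underflow.
    weights = {g: math.exp(v - peak) for g, v in log_like.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def choose_group(post, min_prob=0.5, min_margin=0.2):
    """Pick the most probable group, or fall back to a general pronunciation."""
    ranked = sorted(post.items(), key=lambda kv: kv[1], reverse=True)
    best, p_best = ranked[0]
    p_next = ranked[1][1] if len(ranked) > 1 else 0.0
    if p_best < min_prob:             # claim 7: below the threshold level
        return "GENERAL"
    if p_best - p_next < min_margin:  # claim 8: lead over runner-up too small
        return "GENERAL"
    return best
```

With equal priors, the posterior for each group is simply its likelihood divided by the sum of the likelihoods over all candidate groups, which is why the normalisation step is a single division.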

9. A method as claimed in any of claims 4 to 8 including first searching a dictionary (10) for an entry
corresponding to the input word, each entry containing a word and phonemics for that word; and
sending an entry to the voice realisation unit for pronunciation when the dictionary searching reveals that
entry corresponding to the input word.
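
The overall flow of claims 4 and 9 (dictionary first, then filter, grapheme analyser, letter-to-sound rules, stress assignment and voice realisation) can be sketched as follows. Every component is passed in as a plain callable or mapping; the function name pronounce and all parameter names are hypothetical stand-ins for the numbered units, and the data they exchange is simplified to strings.

```python
def pronounce(word, dictionary, filter_fn, analyse_fn,
              letter_to_sound, assign_stress, speak):
    """Route one input word through the stages described in claims 4 and 9.

    dictionary      : mapping of lower-cased word -> stored phonemics
    filter_fn       : word -> (identified_group or None, candidate_groups)
    analyse_fn      : (word, candidate_groups) -> most probable group
    letter_to_sound : mapping of group -> function(word) -> phonemics
    assign_stress   : (phonemics, group) -> phonemics with stress
    speak           : callable standing in for the voice realisation unit
    """
    # Claim 9: a dictionary hit goes straight to the voice realisation unit.
    entry = dictionary.get(word.lower())
    if entry is not None:
        speak(entry)
        return entry

    # Claim 4: the filter either identifies a group or narrows the candidates.
    identified, candidates = filter_fn(word)
    if identified is not None:
        group = identified
    else:
        group = analyse_fn(word, candidates)

    # Language-specific letter-to-sound rules, then stress assignment.
    phonemics = letter_to_sound[group](word)
    stressed = assign_stress(phonemics, group)
    speak(stressed)
    return stressed
```

Passing the stages in as arguments keeps the sketch independent of how any single unit is actually built.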

10. Apparatus for positively identifying or eliminating a language group (U-Ui) as a language group of
origin for a given word, comprising:
a filter rule store (68) which stores a set of filter rules, a first subset of the filter rules positively identifying a
language group, and a second subset of the filter rules eliminating a language group;
a comparator (12) which compares substrings of graphemes of an input word to the first and second
subsets of filter rules until a match of one of the substrings to one of the first subset of filter rules positively
identifies a language group or eliminates any language group when a match of one of the substrings to one
of the second subset of filter rules indicates a language group is eliminated from consideration as a
language group of origin for the input word; and
an output which produces a list of possible language groups of origin when no language group is positively
identified as the language group of origin, and which produces an indication of the language group of origin
when the language group of origin is positively identified.

11. Apparatus as claimed in claim 10 including an analyser (14) for calculating the most probable
language group of origin for the graphemes in the given word for each language not eliminated by the
second subset of the filter rules received from the output.

12. Apparatus as claimed in claim 11 in which the analyser analyses graphemes in the given word
arranged into trigrams.

ABSTRACT OF THE DISCLOSURE



An apparatus and method for correctly pronouncing proper names 
from text using a computer provides a dictionary which performs 
an initial search for the name. If the name is not in the 
dictionary, it is sent to a filter which either positively 
identifies a single language group or eliminates one or more 
language groups as the language group of origin for that word. 
When the filter cannot positively identify the language group 
of origin for the name, a list of possible language groups is
sent to a grapheme analyzer. Using grapheme analysis, the most
probable language group of origin for the name is determined 
and sent to a language-sensitive letter-to-sound section. In 
this section, the name is compared with language-sensitive 
rules to provide accurate phonemics and stress information for 
the name. The phonemics (including stress information) are 
sent to a voice realization unit for audio output of the name. 